Abstract

High frequency data provides a rich source of information for understanding finan- 
cial markets and time series properties of returns. This paper estimates models of 
high-frequency index futures returns using 'around the clock' 5-minute returns that
incorporate the following key features: multiple persistent stochastic volatility factors, 
jumps in prices and volatilities, seasonal components capturing time of the day pat- 
terns, correlations between return and volatility shocks, and announcement effects. We 
develop an integrated MCMC approach to estimate interday and intraday parameters 
and states using high-frequency data without resorting to various aggregation measures 
like realized volatility. We provide a case study using financial crisis data from 2007 
to 2009, and use particle filters to construct likelihood functions for model comparison 
and out-of-sample forecasting from 2009 to 2012. We show that our approach improves 
realized volatility forecasts by up to 50% over existing benchmarks. 
1 Introduction



Many important financial markets trade 'around the clock,' 24 hours a day from Sunday 
night to Friday close. This trading generates highly informative intraday prices and provides 
an important laboratory for studying the economics of trading, liquidity provision and
measurement, and the mechanics of price discovery. These returns are also highly informative
for forecasting volatility, modeling tail events and risk management (see, e.g., Andersen and
Bollerslev, 1998; Andersen, Bollerslev, Diebold and Labys, 2003).



Formally modeling intraday 'around the clock' returns is difficult and, in fact, rarely
attempted due to the intricate structure of high-frequency returns, the model complexity
required to capture interday and intraday return dynamics, and the burdens generated by vast
datasets. To see the first component, Figure 1 displays the mean absolute 5-minute returns
on the S&P 500 E-mini futures during the day (intraday volatility), and the mean absolute
returns each day from 2007 to 2012 (interday volatility). Within the day, return volatility
has a complex periodic or seasonal structure, driven by the migration of trading through
Asian, European and US trading hours and macroeconomic announcements (Andersen and
Bollerslev, 1997, 1998). Across days, volatility is persistent, stochastic, and mean-reverting.



(a) Mean Absolute Returns by Period of Day (Intraday Volatility)

Figure 1: Summary of five-minute returns on S&P E-mini futures, March 2007 - March 2012. (a)
Mean absolute returns for each period of the day. The trading day runs from 18:00 ET-17:30 ET,
with a break in trading from 16:15-16:30. Macroeconomic announcement times are marked with
an 'x', and selected major market open and closing times are marked with vertical lines. (b) Mean
absolute returns by date.

Models capturing these components require complicated shocks and have many parameters
and latent states. Inference using large samples of high-frequency data is computationally
difficult. Because of this, most research aggregates intraday returns into realized
volatility measures for estimation and model specification (see, e.g., Andersen and Benzoni,
2009; Barndorff-Nielsen and Shephard, 2007, for recent reviews). Realized volatility (RV), in
its simplest form, is constructed by summing squared intraday returns. Most papers ignore
intraday seasonality and the information in overnight returns, using only price data from
'normal' trading hours, from 9:30 ET to 16:00 ET. It is also common to focus exclusively
on total volatility, without specifying or estimating the remaining features of the return
distribution, limiting the scope for financial applications that require a fully specified return
distribution.
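To make the simplest RV construction concrete, the following is a minimal sketch of summing squared intraday returns. The function names and the example returns are hypothetical and are not taken from the paper's data or code.

```python
import math


def realized_variance(returns):
    """Simplest realized variance: the sum of squared intraday returns."""
    return sum(r * r for r in returns)


def realized_volatility(returns):
    """Square root of the realized variance, in the same units as the returns."""
    return math.sqrt(realized_variance(returns))


# Hypothetical 5-minute percentage returns for a short stretch of one day.
example_returns = [0.02, -0.05, 0.01, 0.03, -0.04]
rv = realized_variance(example_returns)
```

In practice the sum would run over all 5-minute returns in the chosen window, for example the 9:30 ET to 16:00 ET session mentioned above.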

Because of these hurdles, few authors have attempted to directly model high-frequency
data. Some notable exceptions include Andersen and Bollerslev (1997, 1998), who model
around-the-clock 5-minute Deutsche Mark/Dollar exchange rates using a long-memory GARCH
model with seasonal and announcement effects (see also Martens, Chang and Taylor, 2002).
Deo, Hurvich and Lu (2006) proposed a long-memory model with time-varying seasonal
components to model 30-minute S&P 500 returns during regular trading hours. Engle and
Sokalska (2012) implement a GARCH model on 10-minute return data for 2500 individual
stocks with a seasonal component using third-party interday volatility estimates.

To overcome these hurdles, we develop a highly tuned MCMC algorithm to estimate 
fully specified return models with high-frequency returns. These models, more general than 
any in the literature to date, incorporate seasonal components (capturing intraday volatil- 
ity patterns), multiple persistent 'multiscale' stochastic volatility (SV) factors (capturing 
slow-moving interday and fast-moving intraday volatility components), macroeconomic an- 
nouncement effects, randomly arriving jumps in prices (capturing outliers and unexpected 
news announcements), asymmetries via leverage effects, and fat-tailed return shocks. We 
use particle filters to construct filtering distributions for model evaluation and forecasting. 

Our approach is fundamentally different from and provides several advantages relative to
the literature. First, our models characterize the entire return distribution, in contrast to RV
models that typically focus only on the second moment of returns, and provide flexibility
to analyze returns at different frequencies via the model-implied time-aggregation. This
provides the means to forecast volatility or any portion of the return distribution at different
frequencies. Second, we estimate all of the parameters and states simultaneously. Prior
work incorporating seasonality into formal models used two-stage estimation procedures
laden with restrictive assumptions. Third, there is no need to fit separate forecasting models
as is common in the literature (see, e.g., Andersen et al., 2003; Shephard and Sheppard,
2010). Fourth, our Bayesian approach allows us to quantify estimation risk and parameter
uncertainty, which are important for financial applications. Finally, by not aggregating
intraday returns into daily realized measures, we efficiently use all data, including overnight
periods, while accounting for seasonal and announcement effects.

Our empirical case study estimates models using 5-minute return data during the financial 
crisis, from March 2007 to March 2009, and forecasts from 2009 to 2012 conditional on 
parameter estimates. These are particularly turbulent times that are especially important 
to model and understand, and, from a practical perspective, the dramatic volatility changes 
highlight the need for accurate forecasts. We examine smoothed state estimates from our 
models during the height of the financial crisis in September and October 2008 and conduct 
an extensive out-of-sample forecasting study showing the strong predictive ability of our 
approach and the substantial improvements relative to existing approaches. 



2 Modeling and estimation approach 
2.1 Stochastic volatility models 

We assume that 5-minute logarithmic price returns, $y_t$, follow the model

$$y_t = 100 \cdot \log\left(\frac{P_t}{P_{t-1}}\right) = \mu + v_t \varepsilon^*_t + J_t Z^y_t,$$

where $P_t$ are the futures prices, $\mu$ is the mean return, $v_t$ is the diffusive or non-jump
component of total volatility, $J_t$ is an i.i.d. Bernoulli jump indicator variable with
$J_t \sim \mathrm{Bern}(\kappa)$, $Z^y_t \sim \mathcal{N}(\mu_y, \sigma_y^2)$ are the jumps in returns, and
$\varepsilon^*_t$ are $t_\nu(0, 1)$ random variables. The errors are written as a scale mixture:
$\varepsilon^*_t = \sqrt{\lambda_t}\,\varepsilon_t$, where $\lambda_t \sim \mathcal{IG}(\nu/2, \nu/2)$ is an i.i.d. mixing
variable and $\varepsilon_t$ is i.i.d. standard normal.
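The scale-mixture representation can be illustrated with a short simulation. This is only a sketch using the Python standard library: the function names and all parameter values are hypothetical, and it relies on the standard fact that the reciprocal of a Gamma(shape $\nu/2$, rate $\nu/2$) draw is inverse-gamma with the same parameters.

```python
import math
import random


def draw_t_error(rng, nu):
    """Draw eps* = sqrt(lambda) * eps with lambda ~ IG(nu/2, nu/2)."""
    # random.gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
    gamma_draw = rng.gammavariate(nu / 2.0, 2.0 / nu)
    lam = 1.0 / gamma_draw  # inverse-gamma mixing variable
    return math.sqrt(lam) * rng.gauss(0.0, 1.0)


def simulate_return(rng, mu, v_t, nu, kappa, mu_y, sigma_y):
    """One draw of y_t = mu + v_t * eps*_t + J_t * Z^y_t."""
    jump = 1 if rng.random() < kappa else 0  # Bernoulli(kappa) indicator
    jump_size = rng.gauss(mu_y, sigma_y)     # normal jump size
    return mu + v_t * draw_t_error(rng, nu) + jump * jump_size


rng = random.Random(0)
nu = 20.0  # hypothetical degrees of freedom
draws = [draw_t_error(rng, nu) for _ in range(200_000)]
sample_var = sum(d * d for d in draws) / len(draws)
# A t_nu variable has variance nu / (nu - 2) when nu > 2.
theoretical_var = nu / (nu - 2.0)
```

With a large number of draws, `sample_var` should be close to `theoretical_var`, which is a quick sanity check that the mixture reproduces Student-t tails.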

At this level, the model resembles common parametric SV models with jumps that provide
an accurate fit to daily index returns and are useful for various applications. There is also
strong nonparametric evidence for jumps and SV from intraday data; see, e.g., Andersen
and Shephard (2009) for a review. Beginning with Barndorff-Nielsen and Shephard (2002,
2004), there is a large literature developing nonparametric tests to identify jump components.
Chib, Nardari and Shephard (2002) and Jacquier, Polson and Rossi (2004) first estimated
SV models with scale mixture errors.



We assume a multiplicative model for the latent volatility process:

$$v_t = \sigma \cdot X_{t,1} \cdot X_{t,2} \cdot S_t \cdot A_t,$$

where $X_{t,1}$ and $X_{t,2}$ are SV processes, $S_t$ is the seasonal component, and $A_t$ is the
announcement component. Each factor scales up or down the other factors, and $\sigma$ can be
interpreted as the modal volatility when the factors are at their baseline levels,
$X_{t,1} = X_{t,2} = S_t = A_t = 1$. It is useful for estimation purposes to express the model as

$$h_t = \mu_h + x_{t,1} + x_{t,2} + s_t + a_t, \qquad (2)$$

where $h_t = \log(v_t^2)$, $\mu_h = \log(\sigma^2)$, $x_{t,i} = \log(X_{t,i}^2)$, $s_t = \log(S_t^2)$, and
$a_t = \log(A_t^2)$. Here $h_t$ represents the total log-variance, and $\mu_h$ is the log-variance
when the other components are at their baseline levels of $x_{t,1} = x_{t,2} = s_t = a_t = 0$.

The volatility factors evolve stochastically via

$$x_{t+1,1} = \phi_1 x_{t,1} + \sigma_1 \eta_{t,1} \quad \text{and} \quad x_{t+1,2} = \phi_2 x_{t,2} + \sigma_2 \eta_{t,2} + J_t Z^v_t,$$

where $\eta_{t,i} \sim \mathcal{N}(0, 1)$ are mutually uncorrelated white noise sequences, $J_t$ is the same jump
arrival process as returns, and $Z^v_t \sim \mathcal{N}(\mu_v, \sigma_v^2)$ are the jumps in log volatility. This allows
the fast volatility to jump, while the slow factor diffuses over time. We also allow for an
additional leverage effect, $\rho = \mathrm{corr}(\varepsilon_t, \eta_{t,2})$, to capture the contemporaneous correlation
between the shocks in returns and fast volatility.
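A minimal simulation sketch of the two log-variance factors is given below. It assumes the AR(1) recursions with Bernoulli-arriving normal jumps in the fast factor only, and it omits the leverage correlation for brevity. The function name, argument names, and default values are hypothetical.

```python
import random


def simulate_factors(T, phi1, sigma1, phi2, sigma2,
                     kappa=0.0, mu_v=0.0, sigma_v=0.0,
                     x1_0=0.0, x2_0=0.0, seed=0):
    """Simulate T steps of the slow (x1) and fast (x2) log-variance factors."""
    rng = random.Random(seed)
    x1, x2 = [x1_0], [x2_0]
    for _ in range(T):
        jump = 1 if rng.random() < kappa else 0   # shared jump indicator J_t
        z_v = rng.gauss(mu_v, sigma_v)            # jump size in log volatility
        # Slow factor: pure diffusion.
        x1.append(phi1 * x1[-1] + sigma1 * rng.gauss(0.0, 1.0))
        # Fast factor: diffusion plus an occasional jump.
        x2.append(phi2 * x2[-1] + sigma2 * rng.gauss(0.0, 1.0) + jump * z_v)
    return x1, x2
```

Setting both innovation scales and the jump intensity to zero reduces each path to a geometric decay toward zero, which gives an easy deterministic check of the recursion.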

Jumps in volatility play a crucial role in capturing market moves in financial stress periods
such as the crash of 1987, the Asian crisis in 1997, the Long Term Capital Management crisis
in 1998 and the attacks of 9/11/2001 (see, e.g., Eraker, Johannes and Polson, 2003). More
recent research focuses on intraday returns and also finds strong evidence for volatility jumps
(e.g., Todorov, 2011). As seen below, jumps in volatility play a crucial role in fitting
high-frequency volatility during the Financial Crisis in 2008-2009.

The model exhibits "multiscale" SV. We assume $0 < \phi_2 < \phi_1 < 1$, which identifies
$X_{t,1}$ and $X_{t,2}$ as the 'slow' and 'fast' moving volatility factors, respectively. $X_{t,1}$ captures the
high persistence of volatility documented at daily and lower frequencies, while $X_{t,2}$ is a
volatile, rapidly moving factor capturing bursts of higher frequency volatility. Both factors
are affected by intraday shocks, relaxing a common assumption that non-periodic volatility
is constant intraday (see, e.g., Andersen and Bollerslev, 1997, 1998).



We assume $s_t$ and $a_t$ are deterministic. For the seasonal component, let $\beta = (\beta_1, \ldots, \beta_{288})'$
denote the log-variance effects for each 5-minute period, assuming $\sum_{k=1}^{288} \beta_k = 0$. Define
$F_t = (F_{t1}, \ldots, F_{t,288})'$ as an indicator vector with $k$th component equal to one if time $t$
occurs at period of the day $k$ and zero otherwise. The seasonal component is $s_t = F_t'\beta$. We
assume a piecewise cubic smoothing spline prior for the $\beta_k$'s, which allows for discontinuities
at market open/closing times when jumps occur (periods 1, 25, 109, 187, 265, 271). Following
Wahba (1978) and Kohn and Ansley (1987), the prior can be written as $\beta \sim \mathcal{N}(0, \tau_s^2 U_s)$,
where $\tau_s^2$ is an unknown smoothing parameter and $U_s$ is a known correlation matrix (see
Appendix C). The model can then be written in state space form, facilitating the use of
the forward filtering backward sampling (FFBS) algorithm. (See also Weinberg, Brown and
Stroud, 2007, for a similar approach using call center data.)
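The indicator-vector form of the seasonal component can be sketched as follows. The effect values are hypothetical and chosen only to show the sum-to-zero constraint and the conversion to the standard deviation scale; periods are indexed from 0 here rather than from 1, and no spline prior is involved.

```python
import math

PERIODS_PER_DAY = 288  # number of 5-minute periods in 24 hours


def center(beta):
    """Impose the sum-to-zero constraint by subtracting the mean effect."""
    mean = sum(beta) / len(beta)
    return [b - mean for b in beta]


def indicator(k):
    """F_t: one at period-of-day index k (0-based), zero elsewhere."""
    return [1.0 if j == k else 0.0 for j in range(PERIODS_PER_DAY)]


def seasonal_log_variance(beta, k):
    """s_t = F_t' beta, which simply picks out the effect for period k."""
    return sum(f * b for f, b in zip(indicator(k), beta))


# Hypothetical effects: elevated log-variance during 96 'busy' periods.
raw = [1.0 if 180 <= k < 276 else 0.0 for k in range(PERIODS_PER_DAY)]
beta = center(raw)
s_quiet = seasonal_log_variance(beta, 10)
s_busy = seasonal_log_variance(beta, 200)
# Standard deviation scale: S = exp(s / 2), with S = 1 as the baseline.
S_quiet, S_busy = math.exp(s_quiet / 2.0), math.exp(s_busy / 2.0)
```

Because $F_t$ has a single nonzero entry, an implementation would normally index `beta[k]` directly; the explicit dot product is kept here only to mirror the notation.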

To model $a_t$, assume the market requires some 'digestion' time, when volatility
deterministically increases for $K$ 5-minute periods after an announcement. We assume $K = 5$, thus
digestion lasts 25 minutes, but the majority of the impact occurs in the first few periods.
For each announcement type $i = 1, \ldots, n$, let $\alpha_i = (\alpha_{i1}, \ldots, \alpha_{i5})'$ denote the log-variance
effects in the first five periods after an announcement, and let $H_{ti} = (H_{ti1}, \ldots, H_{ti5})'$ be
an indicator vector with $k$th component equal to one if an announcement occurred at
period $t - k$ and zero otherwise. Then we can write $a_t = H_t'\alpha$, where $\alpha = (\alpha_1', \ldots, \alpha_n')'$ and
$H_t = (H_{t1}', \ldots, H_{tn}')'$. We assume cubic smoothing spline priors, $\alpha_i \sim \mathcal{N}(0, \tau_a^2 U_a)$, where
$\tau_a^2$ is an unknown smoothing parameter and $U_a$ is a correlation matrix (see Appendix C). We
consider the $n = 14$ announcement types listed in Appendix G. Sunday open is treated as
an announcement, as it is not periodic on the daily frequency.
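For a single announcement type, the announcement component can be sketched as below. The effect sizes and the announcement time are hypothetical; with several types, the contributions would simply be summed across types.

```python
def announcement_log_variance(t, announcement_times, alpha):
    """a_t = sum over k of alpha_k * 1[announcement occurred at period t - k].

    `alpha` holds the K log-variance effects (K = 5 in the text) and
    `announcement_times` is a set of period indices for one announcement type.
    """
    total = 0.0
    for k, effect in enumerate(alpha, start=1):
        if (t - k) in announcement_times:
            total += effect
    return total


# Hypothetical decaying effects and a single announcement at period 100.
alpha_example = [3.0, 2.0, 1.2, 0.6, 0.2]
times = {100}
profile = [announcement_log_variance(t, times, alpha_example)
           for t in range(99, 108)]
```

The resulting `profile` is zero before the announcement, takes the five effects in order over the following five periods, and returns to zero afterwards.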

2.2 Estimation approach 

We take a Bayesian perspective and use MCMC for posterior simulation. Denoting
$z_t = (x_{t,1}, x_{t,2}, \lambda_t, J_t, Z^y_t, Z^v_t)$ and $z^T = (z_1, \ldots, z_T)$, the joint posterior distribution is

$$p(z^T, \beta, \alpha, \theta^* \mid y^T) \propto \left[\prod_{t=1}^{T} p(y_t \mid z_t, \beta, \alpha, \theta^*)\, p(z_t \mid z_{t-1}, \theta^*)\right] p(\beta \mid \theta^*)\, p(\alpha \mid \theta^*)\, p(\theta^*),$$

where $\theta^* = (\mu, \mu_h, \phi_1, \sigma_1, \phi_2, \sigma_2, \rho, \nu, \kappa, \mu_y, \sigma_y, \mu_v, \sigma_v, \tau_a, \tau_s)$ are the static parameters and
$y^T = (y_1, \ldots, y_T)$. The priors and details of the MCMC algorithm are given in Appendix
D. We use standard conjugate priors where possible and in all cases proper priors.

Our algorithm is highly tuned using a number of useful representation and sampling
'tricks.' For the SV components, we express the model as a linear, but non-Gaussian
system and use the Carter and Kohn (1994) and Frühwirth-Schnatter (1994) forward-filtering
backward sampling (FFBS) algorithm for block updating, an approach first used for SV
models in Kim et al. (1998) (see also Omori, Chib, Shephard and Nakajima, 2007). In some
cases, we draw the parameters and states together. We use the Kohn and Ansley (1987)
representation to express the cubic smoothing spline as a state space model and update the
parameters and smoothing parameters in a block.

Efficiently programmed in C, the MCMC algorithm takes 12-25 minutes on a 2.8 GHz 
Xeon processor to perform 12,500 iterations for each year of 5-minute returns (around 70,500 
observations), depending on the model. Computing time is approximately linear for the 
sample sizes considered. This implies our approach could be used in real-time, with, for 
example, parameters estimated daily, using particle filters in real-time to filter and forecast 
throughout the trading day. 

Particle filters provide approximate samples from $p(z_t \mid y^t, \hat{\theta})$, where $\hat{\theta}$ is a posterior
measure of location for the parameters. The model structure allows us to use variations of
the auxiliary particle filter (APF) of Pitt and Shephard (1999), which is more efficient than
the original Gordon, Salmond and Smith (1993) algorithm, as it propagates high likelihood
particles and is adapted to new observations. The details are given in Appendix E.
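For orientation, the following is a minimal sketch of the simpler bootstrap filter in the style of Gordon, Salmond and Smith (1993), not the auxiliary particle filter used in the paper. It is applied to a deliberately reduced one-factor Gaussian SV model, $y_t \sim \mathcal{N}(0, \exp(\mu_h + x_t))$ with $x_t = \phi x_{t-1} + \sigma_\eta \eta_t$, which is an assumption made here for illustration; jumps, t-errors, seasonality, and announcements are all omitted. The function and argument names are hypothetical.

```python
import math
import random


def bootstrap_pf_loglik(y, mu_h, phi, sigma_eta, n_particles=500, seed=0):
    """Estimate the log-likelihood of y with a bootstrap particle filter."""
    rng = random.Random(seed)
    # Initialize particles from the stationary distribution of the AR(1) state.
    sd0 = sigma_eta / math.sqrt(1.0 - phi * phi)
    particles = [rng.gauss(0.0, sd0) for _ in range(n_particles)]
    loglik = 0.0
    for obs in y:
        # Propagate each particle through the state equation.
        particles = [phi * x + sigma_eta * rng.gauss(0.0, 1.0) for x in particles]
        # Log weight: Gaussian density of obs with variance exp(mu_h + x).
        logw = [-0.5 * (math.log(2.0 * math.pi) + (mu_h + x)
                        + obs * obs * math.exp(-(mu_h + x)))
                for x in particles]
        m = max(logw)  # subtract the maximum for numerical stability
        w = [math.exp(lw - m) for lw in logw]
        # Predictive likelihood estimate: average of the unnormalized weights.
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik
```

The per-period terms accumulated in `loglik` are particle approximations to the one-step-ahead predictive densities, so the same loop also yields filtered state samples after each resampling step.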



At this stage, it is useful to contrast our estimation approach to that of Andersen and
Bollerslev (1997, 1998), who use a two-step procedure to first estimate daily volatility, which
is assumed to be constant during the day, and then use a flexible parametric model to
extract the seasonal component. We estimate all of the parameters and state variables at
once, avoiding the need for potentially inefficient multi-stage estimators and restrictive model
assumptions like normally distributed shocks and the absence of jumps. Another approach
first aggregates intraday returns into daily RV statistics, and then uses these statistics to
estimate the model at a daily frequency (see, e.g., Barndorff-Nielsen and Shephard, 2002;
Todorov, 2011). We estimate the models directly on 5-minute returns, without aggregating
to RV, which allows us to identify intraday components and forecast at high frequencies.



2.3 Decompositions and Diagnostics 

Decomposing variance and comparing models is straightforward using MCMC output and
particle filters. Consider the general model in equation (2). To quantify relative importance,
we compute the posterior mean for the total log variance and for each variance component at
each time period, e.g., $\hat{x}_{t,i} = E[x_{t,i} \mid y^T]$, where $y^T = (y_1, \ldots, y_T)$, run univariate regressions
of the form $\hat{h}_t = b_0 + b_1 \hat{x}_{t,i}$ and report the $R^2$ for each variance component as a measure
of the fraction of total variance. We have experimented with other methods, and they give
similar results. We report decompositions in both log-variance and in volatility units.
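The regression-based decomposition can be sketched in a few lines. The series below are hypothetical stand-ins for smoothed posterior means, and the helper name is an assumption; the only fact used is that the $R^2$ of a univariate regression with an intercept equals the squared sample correlation.

```python
def r_squared(y, x):
    """R^2 from regressing y on an intercept and x (squared correlation)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical smoothed components: a trending slow factor and a small,
# oscillating fast factor; the total is their sum.
x1_hat = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2_hat = [0.3, -0.3, 0.3, -0.3, 0.3, -0.3]
h_hat = [a + b for a, b in zip(x1_hat, x2_hat)]

share_slow = r_squared(h_hat, x1_hat)
share_fast = r_squared(h_hat, x2_hat)
```

Note that when components are correlated the individual $R^2$ values need not sum to one, so they are best read as relative importance measures.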

A common metric for model comparison is the Bayes factor, $B^t_{i,j} = P[\mathcal{M}_i \mid y^t] / P[\mathcal{M}_j \mid y^t]$,
where $\{\mathcal{M}_i\}$ are the models under consideration, $P[\mathcal{M}_i \mid y^t] \propto p(y^t \mid \mathcal{M}_i) P(\mathcal{M}_i)$, and
$p(y^t \mid \mathcal{M}_i)$ is the marginal likelihood. It is not possible to compute marginal
likelihoods directly for each time period, as this requires fully sequential parameter estimation. As
an alternative, we report log-likelihood and Bayesian Information Criterion (BIC) statistics,
the latter of which is an approximation to the Bayes factor.

The likelihood of the observed sample in model $\mathcal{M}_i$ is

$$\mathcal{L}(y^T \mid \theta^{(i)}, \mathcal{M}_i) = \prod_{t=0}^{T-1} p(y_{t+1} \mid \theta^{(i)}, y^t, \mathcal{M}_i),$$

where $\theta^{(i)}$ are the parameters in $\mathcal{M}_i$, $p(y_{t+1} \mid \theta^{(i)}, y^t, \mathcal{M}_i)$ is the predictive return distribution,

$$p(y_{t+1} \mid \theta^{(i)}, y^t, \mathcal{M}_i) = \int p(y_{t+1} \mid \theta^{(i)}, z_{t+1}, \mathcal{M}_i)\, p(z_{t+1} \mid \theta^{(i)}, y^t, \mathcal{M}_i)\, dz_{t+1},$$

$p(y_{t+1} \mid \theta^{(i)}, z_{t+1}, \mathcal{M}_i)$ is the conditional likelihood and $p(z_{t+1} \mid \theta^{(i)}, y^t, \mathcal{M}_i)$ is the predictive
distribution of the states. Here, $z_{t+1} = (x_{t+1,1}, x_{t+1,2}, \lambda_{t+1}, J_{t+1}, Z^y_{t+1}, Z^v_{t+1})$ and $\theta^{(i)} = (\theta^*, \beta, \alpha)$.
It is straightforward to use approximate samples from $p(z_t \mid y^t, \theta^{(i)}, \mathcal{M}_i)$ to generate
approximate samples from the predictive distributions and the predictive likelihoods. All of these
distributions can be computed at 5-minute frequencies, as well as lower frequencies such as
hourly or daily.

Defining the dimensionality of $\theta^{(i)}$ as $d_i$ in model $\mathcal{M}_i$, the BIC criterion is

$$BIC(\mathcal{M}_i) = -2 \log \mathcal{L}(y^T \mid \theta^{(i)}, \mathcal{M}_i) + d_i \log(T).$$

BIC and Bayes factors are related asymptotically: $BIC(\mathcal{M}_i) - BIC(\mathcal{M}_j) \approx -2 \log B^t_{i,j}$,
where $B^t_{i,j}$ is the Bayes factor computed using data up to time $t$ (Kass and Raftery, 1995).
BIC provides an asymptotic approximation in $T$ to the posterior probability of a given
model. Given our sample sizes, this approximation should perform well. Bayes factors are
often called an "automated Occam's razor," as they penalize loosely parameterized models
(Smith and Spiegelhalter, 1980). Lower BIC statistics indicate better model fit.
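The BIC calculation and its use as an approximate log Bayes factor can be written out as a short sketch. The log-likelihoods, degrees of freedom, and sample size below are hypothetical placeholders, not values from the paper's tables.

```python
import math


def bic(loglik, dof, n_obs):
    """BIC = -2 * log-likelihood + degrees of freedom * log(sample size)."""
    return -2.0 * loglik + dof * math.log(n_obs)


def approx_minus2_log_bayes_factor(bic_i, bic_j):
    """BIC(M_i) - BIC(M_j), an asymptotic stand-in for -2 log B_ij."""
    return bic_i - bic_j


# Hypothetical inputs for two competing models fitted to the same sample.
n_obs = 100_000
bic_base = bic(loglik=-50_000.0, dof=260.0, n_obs=n_obs)
bic_alt = bic(loglik=-49_500.0, dof=270.0, n_obs=n_obs)
diff = approx_minus2_log_bayes_factor(bic_alt, bic_base)
```

A negative `diff` favors the alternative model: here the gain in log-likelihood outweighs the penalty for the additional degrees of freedom.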



The dimensionality or degrees of freedom are not preset for the splines, but are determined
by the degree of fitted smoothness. We compute the degrees of freedom using the state-space
approach of Ansley and Kohn (1987), evaluating the degrees of freedom at each iteration of
the MCMC algorithm and using the posterior mean for model comparisons.



3 Empirical results 
3.1 Data 

We obtained 5-minute tick data from a high-frequency data vendor from March 11, 2007 to 
March 9, 2012, consisting of 352,887 5-minute observations for 1293 trading days. We use 
the first two years, March 11, 2007 to March 8, 2009, for in-sample parameter estimation and 
the remaining three years for out-of-sample forecasting. March 2007 was a natural starting 
date as this coincided with a dramatic increase in 24-hour trading volume. 
Table 1: Mnemonics for the stochastic volatility models that we consider. Here, i = 1 or 2.



E-mini S&P 500 futures open trading at 18:00 on Sunday nights and trade until 16:15 
Friday evening (all times are Eastern). The market closes Monday-Thursday from 16:15- 
16:30 and from 17:30-18:00. The price data is for specific quarterly futures contracts, which 
are converted to a 'continuous contract' by rolling to the next expiration two weeks prior to 
expiration. This accounts for the gap between futures prices of different maturities. Weekly, 
the 'open' return is Friday close to Sunday at 18:00. Similarly, there are 'open' returns 
from 16:15-16:30 and 17:30-18:00. The seasonal components of the model capture potential 
increases in volatility for these periods. On average, there are 279 return observations per 
day. 

S&P 500 futures are one of the most liquid contracts in the world, thus microstructure
effects are limited. Prior research (e.g., Corsi, Mittnik, Pigorsch and Pigorsch, 2008) finds
that 5-minute E-mini returns are free from significant microstructure noise and offer a
realistic compromise between sampling as frequently as possible and avoiding microstructure
effects; see also Ait-Sahalia, Mykland and Zhang (2005).



3.2 In-sample model fits 

We estimate a range of different model specifications, and Table 1 describes the special cases
considered and provides mnemonics for the models. Table 2 reports in-sample fit statistics
including the degrees of freedom, log-likelihoods, and BIC statistics. To ease comparisons,
Table 2 reports approximate Bayes factors based on the difference of BIC statistics relative
to the SV$_1$ model, $-2 \log B_{i,SV_1} = BIC(\mathcal{M}_i) - BIC(\mathcal{M}_{SV_1})$. Better fitting models have
higher likelihoods and lower (more negative) BIC statistics, quantifying the improvement
over a single-factor SV model and the performance relative to the best fitting two-factor
model. We do not separately report the single-factor model fits or parameter estimates
as they generally performed poorly and, for a given specification (e.g., SV with t-errors),
the two-factor models always provided better in-sample and out-of-sample fits than their
single-factor counterparts.

Table 2: Degrees of freedom, log-likelihoods, BIC statistics, and approximate log Bayes factors
(relative to the base SV$_1$ model), for March 2007-March 2009.

The degrees of freedom range from 253 to 284, as indicated by $d$ in Table 2. This consists
of traditional unconstrained 'static' parameters $d^*$ (from 4 to 12) and the spline 'parameters'
or degrees of freedom, $d_s$ and $d_a$, which are less than the number of knot points (288
and 70, respectively) and are determined by the degree of smoothness of the fitted splines.
In some cases, more complicated models have fewer degrees of freedom than their simpler
counterparts (e.g., $d = 268$ for SV$_2$ vs. $d = 253$ for SVCJ$_2$), despite the fact that more
complicated models have more static parameters.

Overall, the multiscale, two-factor SV models perform best and in all cases, the BIC and
log-likelihood statistics provide the same relative conclusion, indicating that the parameters
are accurately estimated, which is not a surprise given the sample sizes. The best in-sample
performing models have leverage effects and allow for outlier movements, via either jumps
or t-distributed shocks, which are needed to fit the fat tails of high-frequency returns. In
terms of BIC statistics, the SVt$_2$ model provides the best in-sample fit, with the SVCJ$_2$ model
providing the next best fit.

Figure 2: (a) Cumulative log Bayes factors during the estimation period, March 2007-March 2009.
(b) Cumulative log-likelihood ratios during the forecast period, March 2009-March 2012. Values
are relative to the base SV$_1$ model, and are multiplied by -2, so lower values indicate better fit.

We also estimated a number of GARCH models including a simple GARCH(1,1) model
(GARCH), a GARCH(1,1) with t-errors (GARCH-t), and a threshold GARCH model with
t-errors (TGARCH-t). All of these models contain seasonal components, fitted as in Andersen
and Bollerslev (1997). Compared to the GARCH models, the multiscale SV models provide
a substantial improvement. Thus, there are large benefits from using the more complicated
SV model over the simpler and commonly used GARCH specifications.
It is also instructive to monitor model fits sequentially over time, as in West (1986) (see
also Johannes, Polson and Stroud, 2009, in the context of stochastic volatility jump models).
This provides a visual assessment of how models fail, either abruptly or via small errors that
accumulate over time. Figure 2(a) reports in-sample sequential Bayes factors for each model
relative to the SV$_1$ model, thus it displays $BIC(\mathcal{M}_i) - BIC(\mathcal{M}_{SV_1})$. In terms of cumulative
fit, the SVCJ$_2$ and SVt$_2$ perform the best over the period, with their cumulative advantage
growing consistently over time, indicative of a general improvement in fit.

Although most of the time series variation is small, there are a few visible spikes, most
notably the sharp downward spike on 10/24/2008 (indicating a good fit relative to the SV$_1$
model). The cause is a sequence of consecutive zero 5-minute returns. This was not a
data error, as we first assumed, but was generated by a circuit breaker locking S&P 500
futures limit down from 4:55 a.m. to 9:30 a.m. Exchange rules mandate that S&P futures
cannot fall by more than 60 points overnight and trading can occur at prices above, but
not below, this level until 9:30. This generated relatively large likelihoods, as the predicted
mean is effectively zero, and models with fast-moving volatility were able to reduce their
predictive volatility quickly, thus the relatively good fit during this event. Of course, a
complete specification would incorporate a mechanism for locked-limit down markets.

The previous results were for models containing both the seasonality and announcements. 
We also fit all of the models without seasonal and/or announcement components, which were 
not reported to save space. Overall, the announcement components provide a significant, 
though relatively minor improvement to fit, which makes sense, given the relatively small 
number of announcements per week. The seasonal components are very important, dwarfing 
the impact of the announcement effects. 
3.3 Parameter estimates and variance decompositions



Table 3 summarizes the marginal posterior distributions for the parameters and reports
inefficiency factors and acceptance probabilities for the slowest mixing component, $\sigma_1$, for
the multiscale models. The MCMC algorithms generally mix quite well given the large
number of unknown states and parameters, although models with jumps in volatility mix
more slowly than those with diffusive volatility. This has been previously noted in the
literature by Eraker et al. (2003). We do not report the single-factor parameter estimates,
given their relatively poor fits.

The estimates reveal a number of interesting results. As alluded to earlier, the SV factors
correspond to a traditional, slow-moving interday factor and a rapidly moving high-frequency
factor. The point estimate of $\phi_1$ in the best fitting models is 0.9999, corresponding to a
daily AR(1) coefficient of 0.9725 and a half-life ($\log 0.5 / \log \phi_1$) of almost 25 days. These are
consistent with the literature estimating SV models using daily data. The other volatility
factor operates at a very high frequency with a 5-minute AR(1) coefficient $\phi_2$ of 0.926 to
0.958, corresponding to a half-life of around an hour. The second volatility factor is also
highly volatile ($\sigma_2 \gg \sigma_1$). Together, this supports an extreme form of multiscale SV that
would be difficult if not impossible to detect using daily data.
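The conversions from 5-minute coefficients to daily coefficients and half-lives can be reproduced with a short sketch. It assumes 279 five-minute returns per trading day, the average reported in the data description; the helper names are hypothetical.

```python
import math

PERIODS_PER_DAY = 279   # assumed average number of 5-minute returns per day
MINUTES_PER_PERIOD = 5


def daily_coefficient(phi):
    """AR(1) coefficient implied over one trading day of 5-minute steps."""
    return phi ** PERIODS_PER_DAY


def half_life_periods(phi):
    """Number of 5-minute periods for a shock to decay by half."""
    return math.log(0.5) / math.log(phi)


slow_daily = daily_coefficient(0.9999)
slow_half_life_days = half_life_periods(0.9999) / PERIODS_PER_DAY
fast_half_life_minutes = [half_life_periods(p) * MINUTES_PER_PERIOD
                          for p in (0.926, 0.958)]
```

Under this assumption the slow factor gives a daily coefficient near 0.9725 and a half-life just under 25 days, while the fast factor's half-life falls between roughly three quarters of an hour and an hour and a half.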

The slow-moving factor is far more important to overall fit than the fast-moving factor.
To see this, consider the unconditional variance of each factor:

$$\tau_i^2 = \frac{\sigma_i^2 + \kappa\left(\mu_v^2 + \sigma_v^2 - \kappa\mu_v^2\right)}{1 - \phi_i^2},$$

where $\kappa \neq 0$ for the SVCJ$_2$ model, and posterior means and standard deviations for $\tau_i$ are
reported in Table 3. $\tau_1$ is more than 2 times larger than $\tau_2$, indicating that even though $x_{t,1}$ has a
low conditional volatility, the process $x_{t,1}$ varies substantially through the sample and to a
much greater extent than $x_{t,2}$.
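As a sketch of how such an unconditional variance can be evaluated, the function below assumes a stationary AR(1) factor whose innovation is a Gaussian shock plus an independent Bernoulli-arriving normal jump, so that the innovation variance is $\sigma^2 + \kappa(\mu_v^2 + \sigma_v^2) - \kappa^2\mu_v^2$. The function name and the example parameter values are hypothetical and are not the paper's estimates.

```python
def unconditional_variance(phi, sigma, kappa=0.0, mu_v=0.0, sigma_v=0.0):
    """Stationary variance of x_{t+1} = phi * x_t + sigma * eta_t + J_t * Z_t."""
    # Var(J * Z) = E[J * Z^2] - (E[J * Z])^2 for J ~ Bern(kappa), Z ~ N(mu_v, sigma_v^2).
    jump_var = kappa * (mu_v ** 2 + sigma_v ** 2) - (kappa * mu_v) ** 2
    return (sigma ** 2 + jump_var) / (1.0 - phi ** 2)


# Hypothetical values: a very persistent factor with small shocks versus a
# fast-decaying factor with larger shocks.
tau1_sq = unconditional_variance(phi=0.9999, sigma=0.01)
tau2_sq = unconditional_variance(phi=0.95, sigma=0.2)
```

The example shows how high persistence can produce a large unconditional variance even when the conditional shock volatility is small.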
Table 3: Posterior means and standard deviations (in parentheses) for the two-factor models. The
bottom two rows are the Metropolis-Hastings acceptance probabilities and inefficiency factors for
the slowest mixing parameter, $\sigma_1$.
Table 4: Volatility decomposition (percentage of total), March 2007-March 2009



Table 4 provides variance and log-variance decompositions. The interday factor explains
more than half of the total variance (in levels or logs). The second factor plays a relatively
minor role, explaining about 7% of the total variance, but plays a more prominent role when
allowed to jump. Although the second volatility factor tends to explain a small portion
of the overall variance in most models, it plays a very important role in specification as it
eliminates a tension present in single-factor models. The SV factor in single-factor models
tries to fit both low and high-frequency movements, ending up somewhere in between, and
providing a poor fit to both intraday and interday return volatility. For example, in the SVt$_1$
model, the estimate of $\phi_1$ is roughly 0.997, corresponding to a daily autoregressive coefficient
of 0.4325 and a half-life of about 0.80 days, which is much slower than the fast factor and
much faster than the slow factor in two-factor models. Thus, single-factor models have a
difficult time fitting both the high and low-frequency movements.
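As a rough check on the single-factor figures above, assuming the 279 five-minute returns per day reported in the data description, the 5-minute coefficient converts to daily units as follows (values rounded):

$$\phi_1^{279} = 0.997^{279} = \exp(279 \log 0.997) \approx \exp(-0.838) \approx 0.4325, \qquad \frac{\log 0.5}{\log 0.997} \approx 230.7 \text{ periods} \approx 0.83 \text{ days}.$$

Since the reported estimate is only roughly 0.997, small differences from the quoted half-life are to be expected.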

The other parameters are largely as expected. The estimates of $\rho$ are between -0.095 and
-0.129, implying a modest leverage effect. Identifying this parameter using RV is difficult
due to various biases (see, e.g., Ait-Sahalia, Fan and Li, 2012). The estimate of $\nu$ is about
20, indicative of modest non-normality, and consistent with previous daily estimates (e.g.,
Chib et al., 2002; Jacquier et al., 2004).
Time-variation in the variance components accounts for most of the non-normality in models
without jumps. Mean jump sizes, $\mu_y$, are close to zero in the SVCJ$_2$ specification, and
arrivals are frequent with $\kappa = 0.003$, corresponding to a rate of 0.84 per day. Return jumps are
relatively large as $\sigma_y$ is much larger than the modal (non-jump) volatility, e.g., $\sigma_y = 0.331$
vs. $\sigma = 0.060$ in the SVCJ$_2$ model. The jumps in volatility are quite large.

Seasonal Effects

Figure 3: Posterior means and 95% intervals for the seasonal effects $\beta = (\beta_1, \ldots, \beta_{288})$ for the SVt$_2$
model. Results are shown on the standard deviation scale $S = \exp(\beta/2)$.

To quantify the seasonal fits, Figure 3 summarizes the marginal posterior distribution
of $S_t = \exp(s_t/2)$. Recall that $S_t = 1$ corresponds to average 5-minute return volatility,
thus an overnight value of $S_t = 0.5$ implies that seasonal volatility is roughly half of average
volatility. $S_t$ spikes to more than 2.5 at the open and close of U.S. trading, and there is a
clear 'U'-shaped volatility pattern during U.S. trading hours. $S_t$ fluctuates by a factor of
more than 5, highlighting the importance of formally accounting for this periodic structure
when dealing with intraday returns. Also notice the time-variation in the uncertainty over
the seasonal component.
Announcement Effects

Figure 4: Posterior means and 95% intervals for the announcement effects $\alpha_i = (\alpha_{i1}, \ldots, \alpha_{i5})$ for
the SVt$_2$ model. Results are shown on the standard deviation scale $A = \exp(\alpha/2)$.

Figure [4] summarizes the most important announcements for the SVt2 model (the other 
models are similar). Monthly payrolls has the largest impact, with more than 5 times 
baseline volatility for the period after the release. The FOMC announcement is the next 
most important announcement, as volatility is almost 4 times the base level. The rate of 
decrease for the FOMC announcement is also slower than for monthly payrolls, consistent 
with a greater digestion time. Quarterly GDP, the CPI index, the open of trading on Sunday 
night, and Durable Goods orders are the next most important announcements. The other
announcements produce significant but smaller increases and are not reported.

3.4 Sample paths



To understand the state variables, we plot the slow-moving stochastic volatility factor for the
entire in-sample period, and all of the latent factors during two remarkable weeks during
the crisis, including the week when the U.S. House of Representatives voted down the initial
TARP proposal.

Figure [5] plots daily returns, daily RV, and a summary of the posterior distribution of
the slow volatility $\sigma X_{t,1}$. The first spike in volatility occurred in August 2007, with the
panic in short-term lending markets. The next spikes in volatility came after the FOMC
announcement in January 2008, and the Bear Stearns takeover by J. P. Morgan in March
2008. Markets calmed down until Fall 2008, when the crisis elevated volatility to its highest
levels, with $X_{t,1}$ remaining high throughout the Fall of 2008 and early parts of 2009. To get
a sense of the units, $\sigma X_{t,1}$ on an annualized scale in October 2008 was about 60%.

To understand how the individual components capture market movements during the 
financial crisis, we consider two particularly volatile weeks: 9/14/2008 to 9/19/2008 and 
9/28/2008 to 10/3/2008. During the week of 9/14/2008, Lehman Brothers filed for bankruptcy 
on 9/14; a large money market fund 'broke the buck,' with its share price falling below $1, 
AIG was bailed out, the FOMC, in a stunning move, did not move to cut interest rates fur- 
ther (a decision reversed shortly), and Bank of America announced their purchase of Merrill 
Lynch on 9/16; the SEC banned short-selling of financial stocks on 9/18, the Federal Reserve 
created a fund to loan money to banks to purchase asset backed commercial paper and also 
announced plans to purchase agency debt from primary dealers on 9/19. During the week of 
9/28, the main event was on 9/29, when the US House of Representatives rejected legislation 
authorizing TARP. 9/29 was the 3rd largest daily fall (-8.5%) for the S&P 500 index. 

Figure 5: Daily returns, realized volatility, and smoothed means and 95% intervals for the slow
volatility factor $\sigma X_{t,1}$ for the SVt2 model, March 2007-March 2009.

Figure 6: Prices, returns, smoothed volatility components (total volatility, slow volatility, fast
volatility, volatility jumps, seasonal and announcement components) and absolute residuals during
the week of September 14-19, 2008 for the SVCJ2 model. Each panel contains posterior means,
and the bands represent 95% posterior intervals. The second panel from the bottom summarizes
the seasonal fits on the left-hand axis and announcements on the right-hand axis.

Figure 7: Prices, returns, smoothed volatility components (total volatility, slow volatility, fast
volatility, volatility jumps, seasonal and announcement components) and absolute residuals during
the week of September 28-October 3, 2008 for the SVCJ2 model. Each panel contains posterior
means, and the bands represent 95% posterior intervals. The second panel from the bottom
summarizes the seasonal fits on the left-hand axis and announcements on the right-hand axis.

Figure [6] summarizes the smoothed state variables for 9/14 to 9/19 for the SVCJ2 model.
The Sunday night overnight return was -2.75%, as markets opened dramatically lower on
the Lehman news. The model captures this move via a large jump and elevated levels of
intraday and interday volatility. Interday volatility was more than twice its long run average
throughout the week. On 9/16, an FOMC announcement occurred at 14:15, generating
enormous moves: there were three 5-minute periods where S&P futures moved more than 1%.
Despite the already high announcement effect volatility, the model needed a large jump in
volatility to generate these huge moves. Later that day, after the close of normal trading,
there were additional jumps corresponding to the Merrill Lynch merger. The large moves
on 9/18 were associated with rumors and the subsequent announcement of the short-selling
ban on financial stocks, as S&P futures moved from 1140 to almost 1240 overnight. 9/19
was relatively quiet.

Next, Figure [7] summarizes the week of 9/28. On 9/29, the S&P dropped 50 points
in a minute at approximately 12:45 p.m. when markets realized the legislation would not
pass. There were multiple periods with 5-minute absolute returns greater than one or even
two percent. This can be clearly seen in Figure [7]. The SVCJ2 model captured this event
through a combination of high interday volatility ($X_{t,1}$ was twice its long run average), large
jumps in volatility and extremely high intraday volatility (at times up to eight times
its historical average). Friday afternoon, the S&P dropped almost 5% into the close on bank
solvency rumors. Notice in the bottom panel the absence of any outlier residuals.

These results show the key role played by jumps in volatility, capturing the impact of 
unexpected news arrivals by temporarily increasing volatility. In the SVt2 model, large 
outlier shocks generated by the t-distributed errors play a prominent role in explaining these 
large moves. Diffusive volatility is not able to increase rapidly enough to capture extremely 
large movements.

3.5 Out-of-sample forecasting



Conditional on posterior means for the parameters at the end of the in-sample period, we 
run particle filters for the out-of-sample period from March 2009 to March 2012. This period 
provides three handicaps to successful forecasting: the in-sample period is relatively short 
compared to the out-of-sample period; the out-of-sample period had lower overall volatility; 
and we do not update the parameters. 

To compute forecasts, we simulate $p\left(y_{t+\tau} \mid \theta^{(j)}, y^t, \mathcal{M}_j\right)$ using the output of the particle filter
and standard Monte Carlo methods to simulate future returns. We focus on two horizons:
$\tau = 12$, corresponding to hourly forecasts, and $\tau = 279$, corresponding to daily forecasts. To
compare our methods to the extant literature, we focus on predicting realized variance, which
is just the sum of squared returns over the relevant horizon. We report multiple volatility
forecast metrics including the bias, mean absolute error (MAE) and root mean-squared error
(RMSE), which are given by

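$$
\begin{aligned}
\mathrm{BIAS} &= \sum_{s} \left( RV_{s,\tau} - \widehat{RV}_{s,\tau} \right) \\
\mathrm{MAE} &= \sum_{s} \left| RV_{s,\tau} - \widehat{RV}_{s,\tau} \right| \\
\mathrm{RMSE} &= \sqrt{ \sum_{s} \left( RV_{s,\tau} - \widehat{RV}_{s,\tau} \right)^2 },
\end{aligned}
$$

where $s$ indexes the daily (hourly) forecasting periods, $\widehat{RV}_{s,\tau}$ is the model implied
predictive volatility over $\tau$ 5-minute periods using information at time $s$, and $RV_{s,\tau}$ is the
subsequently realized 5-minute volatility:

$$
RV_{s,\tau} = \sqrt{ \sum_{t=s+1}^{s+\tau} y_t^2 }.
$$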
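The error metrics above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, the errors are taken as realized minus forecast, and the metrics are averaged over the forecasting periods, which is an assumption about the normalization.

```python
import numpy as np

def realized_volatility(returns):
    """Square root of the sum of squared 5-minute returns over the horizon."""
    returns = np.asarray(returns, dtype=float)
    return np.sqrt(np.sum(returns ** 2))

def forecast_metrics(rv_realized, rv_forecast):
    """Bias, MAE and RMSE of RV forecasts.

    Assumptions (not taken from the paper): errors are realized minus
    forecast, and each metric is averaged over the forecasting periods.
    """
    rv_realized = np.asarray(rv_realized, dtype=float)
    rv_forecast = np.asarray(rv_forecast, dtype=float)
    err = rv_realized - rv_forecast
    return {
        "bias": float(err.mean()),
        "mae": float(np.abs(err).mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
    }
```

For the hourly horizon, `realized_volatility` would be applied to blocks of 12 consecutive 5-minute returns, and for the daily horizon to blocks of 279.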
Table 5: Hourly and Daily RV Forecasting, March 2009-March 2012.



We also run Mincer-Zarnowitz regressions on volatility levels:

$$
RV_{s,\tau} = b_0 + b_1 \widehat{RV}_{s,\tau} + e_{s,\tau},
$$

and report $R^2$ values for the regressions. Higher $R^2$ values indicate better forecasts.
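A Mincer-Zarnowitz regression of this kind is an ordinary least-squares fit of realized on predicted volatility. The following minimal sketch (the function name is hypothetical) returns the intercept, slope and $R^2$; it assumes two equal-length arrays of realized and forecast values.

```python
import numpy as np

def mincer_zarnowitz(rv_realized, rv_forecast):
    """OLS of realized RV on a constant and the RV forecast.

    Returns (intercept, slope, R^2).
    """
    y = np.asarray(rv_realized, dtype=float)
    x = np.asarray(rv_forecast, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return float(coef[0]), float(coef[1]), float(r2)
```

An unbiased, efficient forecast would give an intercept near 0 and a slope near 1; the $R^2$ is the quantity used in the comparison.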

In contrast to a literature focussing on volatility forecasts, we evaluate the fit of the entire
return distribution via out-of-sample predictive log-likelihoods, which provide a comprehensive
measure of fit, as they measure the ability to predict the entire distribution, instead of a
specific moment. These are also summarized via various empirical coverage probabilities of
the tails of the predictive distribution of realized volatility. For example, 1% of the time, the
observed RV should be smaller than the 1st percentile of the predictive distribution. We report
empirical coverage probabilities for the 1st, 5th, 95th, and 99th percentiles of the predictive RV
distribution. As comparisons, we report results for the long-memory autoregressive model
specification fit to daily RV as in Andersen et al. (2003) (AR-RV), and three GARCH models
fit to the deseasonalized 5-minute returns: a GARCH(1,1) with normal and t errors, and a
threshold GARCH(1,1) with t errors.
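The coverage calculation can be illustrated as follows. The sketch assumes that, for each forecasting period, a set of simulated RV draws from the predictive distribution is available; the array layout and function name are hypothetical.

```python
import numpy as np

def empirical_coverage(rv_realized, rv_draws, levels=(0.01, 0.05, 0.95, 0.99)):
    """Fraction of periods in which realized RV falls below each predictive quantile.

    rv_realized: shape (S,), one realized value per forecasting period.
    rv_draws:    shape (S, N), N predictive draws per period (assumed layout).
    """
    rv_realized = np.asarray(rv_realized, dtype=float)
    rv_draws = np.asarray(rv_draws, dtype=float)
    coverage = {}
    for q in levels:
        cutoffs = np.quantile(rv_draws, q, axis=1)  # per-period predictive quantile
        coverage[q] = float(np.mean(rv_realized < cutoffs))
    return coverage
```

For a well-calibrated predictive distribution, each reported fraction should be close to its nominal level.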

The results are summarized in Table [5]. In terms of the benchmark daily horizon, all of
the two-factor models have a small negative bias, which should not be viewed as a problem 
as estimators with small biases often perform well out-of-sample. Documenting two-factor 
model performance out-of-sample is important, as one might suspect that these models 
provide a better in-sample fit, but perform poorly out-of-sample. That is not the case. 
Although not reported to save space, the two-factor models always perform better than
their single-factor counterparts for forecasting volatility, with lower MAE and RMSE and
higher $R^2$'s. The improvement in $R^2$'s is about 8% when moving from one to two factors.
In terms of the specific two-factor models, they perform broadly similarly, with the ASV2
having the lowest MAE and RMSE and the SVCJ2 model the highest $R^2$ statistic. The
differences are slight. The small differences between two-factor models and the uniform
improvement over one-factor models are important: they show that the models with the
most static parameters, the SVJ2 and SVCJ2 models, were not overfit in-sample, as they
perform well out of sample.

Importantly, all of our two-factor models beat the benchmark GARCH(1,1) model, which
provides an affirmative answer to the provocative paper by Hansen and Lunde (2005) titled
"A forecast comparison of volatility models: does anything beat a GARCH(1,1)?" Our
model-based forecasts also outperform the AR-RV forecasts, indicating that model-based
forecasts are competitive with the best approaches based on reduced form RV forecasts. The
differences are smaller for hourly forecasts, but provide a similar relative ranking.

The two-factor models always cover the tail probabilities more accurately than their 
single-factor counterparts, additional evidence for the usefulness of the two-factor models. 
In general, the empirical probability of seeing RV smaller than the 1st percentile is slightly
greater than 1%. This is consistent with difficulties in modeling the left tail of the RV 
distribution. Overall, the coverage probabilities are close to their theoretical values. The 
two-factor models also generally provide more accurate tail fits than the GARCH models or 
AR-RV model. 

As a final metric, Figure [2] reports out-of-sample log-likelihood ratios for the two-factor
models and intraday GARCH models (relative to the base SV1 model). These provide an
overall measure of model fit, as they are based on the entire predictive distribution and all of
the realized 5-minute returns. The results are very similar to the in-sample results, with the
SVt2 and SVCJ2 models providing the best out-of-sample fit. The GARCH models provide
a particularly poor out-of-sample fit to the entire return distribution. 

4 Discussion 

This paper develops multifactor SV models of around the clock high-frequency equity index 
returns. These models, more general than any in the literature, contain features previously 
documented in the literature using both high-frequency intraday data and lower-frequency 
daily data. We estimate the models directly using MCMC methods and use particle filtering 
methods for forecasting and model evaluation. Our approach provides a complete toolkit 
for estimation, inference and forecasting. We estimate the models using 5-minute S&P 500 
futures data from the financial crisis. 

Our results are summarized as follows. First, in addition to the importance of announcements
and seasonal factors, we find strong evidence for multiple persistent volatility factors.
The slow-moving interday factor is highly persistent with a shock half life of roughly 25 
days. This is consistent with previous estimates based on lower frequency data. The shocks 
to the rapidly moving component have a half-life of roughly an hour, a clear sign of multiscale
volatility. Second, fat-tailed shock models, either via t-distributed return shocks or jumps
in returns and volatility, perform best. Outliers capture the extreme tail behavior of high- 
frequency returns and are crucial components, especially during periods of crisis. Third, the 
slow-moving interday volatility factor explains the largest portion of volatility movements, 
more than 50% in most models, followed by the periodic component. The rapidly moving 
factor and announcements are both significant but play a lesser role. 

Fourth, in jump models, jump intensity estimates are relatively high and jump sizes 
are modest, at least when compared to the daily literature. Jumps arrive around once per 
day, and their volatility is about 5 to 10 times the unconditional 5-minute return standard 
deviation, thus jumps are 'big.' However, the sizes are relatively small compared to previous 
estimates using daily data. Although our sample contains some of the largest index moves 
ever observed in U.S. history, these were not large discontinuous moves, but rather a
large number of modest moves in the same direction. 

Using the smoothed state variables, we provide a detailed analysis of some of the most vi- 
olent periods of the financial crisis, decomposing return volatility during the week of Lehman 
Brothers' bankruptcy and the week when markets crashed after the TARP legislation vote 
failed. These periods document exactly how these complicated models deal with periods of 
extreme stress, highlighting the role of the intraday volatility factor and jumps in volatility. 

Finally, we provide an extensive forecasting exercise. We implement forecasts at the 
hourly and daily frequencies for each model, and compare, where relevant, our forecasts to 
standard models. Regressing RV on predicted volatility, we find out-of-sample $R^2$'s as high
as 73% for the multiscale models, which also always outperform the literature benchmarks
and their single-factor counterparts. We also report out-of-sample log-likelihoods, which
provide a metric accounting for the entire distribution and tail coverage probabilities. In 
both cases, we find that the best in-sample models, the multiscale models with jumps in 
returns and volatility or t-distributed errors, perform well out-of-sample. 

Although most of the literature aggregates intraday returns to daily measures of realized 
variance, our results suggest that formally modeling intraday returns is quite useful, both for 
understanding the components of equity returns and for practical applications like forecasting.
Importantly, all of our conclusions hold for both in- and out-of-sample metrics. These results
imply that formal statistical models can be quite useful, both for understanding volatility 
and its components and for practical financial applications, as the model based forecasts are 
more accurate than standard benchmark forecasts. 

There are many potential extensions and applications to this work. On the theoretical 
side, it would be interesting to build models that account for the discreteness of price changes 
and allow for additional seasonality (day of the week effects, holidays, option-expiration, 
etc.). On the empirical side, we are working on additional studies to understand volatility 
components during the financial crisis and flash-crash in May 2010, high-frequency volatility 
in currency, commodities and fixed income markets, and to use our models for option pricing 
applications.

Appendix A: Model and Priors



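The general two-factor model can be written as

$$
\begin{aligned}
\text{Log Returns} \quad & y_t = \mu + \exp(h_t/2)\sqrt{\lambda_t}\,\varepsilon_t + J_t Z_t^y \\
\text{Total Volatility} \quad & h_t = \mu_h + x_{t,1} + x_{t,2} + F_t'\beta + H_t'\alpha \\
\text{Slow Volatility} \quad & x_{t,1} = \phi_1 x_{t-1,1} + \sigma_1 u_{t-1,1} \\
\text{Fast Volatility} \quad & x_{t,2} = \phi_2 x_{t-1,2} + \sigma_2\left(\rho\,\varepsilon_{t-1} + \sqrt{1-\rho^2}\,u_{t-1,2}\right) + J_{t-1} Z_{t-1}^v \\
\text{Daily Seasonal} \quad & \beta \sim \mathcal{N}(0, \tau_s^2 U_s)\cdot \mathbf{1}(\mathbf{1}'\beta = 0) \\
\text{Announcements} \quad & \alpha_i \sim \mathcal{N}(0, \tau_a^2 U_a), \quad i = 1, \ldots, n \\
\text{Scale Factors} \quad & \lambda_t \sim \mathcal{IG}(\nu/2, \nu/2) \\
\text{Jump Times} \quad & J_t \sim \mathrm{Bern}(\kappa) \\
\text{Return Jumps} \quad & Z_t^y \sim \mathcal{N}(\mu_y, \sigma_y^2) \\
\text{Volatility Jumps} \quad & Z_t^v \sim \mathcal{N}(\mu_v, \sigma_v^2)
\end{aligned}
$$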

Here, $U_s$ and $U_a$ are correlation matrices implied by the smoothing spline priors, as defined in
Appendix C. We assume the following prior distributions for the parameters: $\mu \sim \mathcal{N}(0, 1)$,
$\mu_h \sim \mathcal{N}(-6.2, 1)$, $\phi_i \sim \mathcal{B}^*(20, 1.5)$ for $i = 1, 2$ (with $\phi_1 > \phi_2$), $\sigma_i^2 \sim \mathcal{IG}(.001, .001)$ for
$i = 1, 2$, $\nu \sim \mathcal{DU}(2, 128)$, $\rho \sim \mathcal{U}(-1, 1)$, $\kappa \sim \mathcal{B}(1, 1000)$, $(\mu_y, \sigma_y^2) \sim \mathcal{NIG}(0, 1, 10, .324)$,
$(\mu_v, \sigma_v^2) \sim \mathcal{NIG}(.50, 10, 10, 1)$, and $\tau_i^2 \sim \mathcal{IG}(.001, .001)$, for $i = s, a$. Here $\mathcal{B}^*$ denotes the
transformed Beta distribution from Chib et al. (2002), $\mathcal{DU}$ is the discrete uniform, and $\mathcal{NIG}$
is the normal-inverse gamma distribution.
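To make the data-generating process concrete, the following sketch simulates returns from a simplified version of the two-factor model. It is an illustration under stated assumptions, not the authors' code: the seasonal and announcement terms are set to zero, the default parameter values are purely illustrative, and the function name is hypothetical.

```python
import numpy as np

def simulate_two_factor_sv(T, mu=0.0, mu_h=-5.6, phi=(0.999, 0.95),
                           sigma=(0.02, 0.15), rho=-0.1, nu=20.0,
                           kappa=0.003, mu_y=0.0, sigma_y=0.33,
                           mu_v=0.5, sigma_v=1.0, seed=0):
    """Simulate T returns; seasonal/announcement effects omitted (assumption)."""
    rng = np.random.default_rng(seed)
    x1, x2 = 0.0, 0.0                      # slow and fast log-volatility factors
    y = np.empty(T)
    h = np.empty(T)
    for t in range(T):
        h[t] = mu_h + x1 + x2
        # lambda_t ~ IG(nu/2, nu/2), drawn as the reciprocal of a Gamma variate
        lam = 1.0 / rng.gamma(nu / 2.0, 2.0 / nu)
        eps = rng.standard_normal()
        jump = rng.random() < kappa        # J_t ~ Bern(kappa)
        zy = rng.normal(mu_y, sigma_y) if jump else 0.0
        zv = rng.normal(mu_v, sigma_v) if jump else 0.0
        y[t] = mu + np.exp(h[t] / 2.0) * np.sqrt(lam) * eps + zy
        # transitions to t + 1; the fast factor is correlated with eps (leverage)
        x1 = phi[0] * x1 + sigma[0] * rng.standard_normal()
        x2 = (phi[1] * x2
              + sigma[1] * (rho * eps + np.sqrt(1.0 - rho ** 2) * rng.standard_normal())
              + zv)
    return y, h
```

Calling `simulate_two_factor_sv(279)` produces roughly one trading day of 5-minute returns under these illustrative settings.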






Appendix B: Auxiliary Mixture Model 



The volatility states and parameters are updated using the mixture approximation of Omori
et al. (2007). Conditional on $\mu$, $\lambda_t$, $J_t$, $Z_t^y$, the returns are transformed to $(y_t^*, d_t)$, where

$$
y_t^* = \log\left( \frac{(y_t - \mu - J_t Z_t^y)^2}{\lambda_t} + \text{const} \right), \qquad d_t = \mathrm{sign}\left(y_t - \mu - J_t Z_t^y\right),
$$

and const = .0001 is used to avoid logs of zeros. We then write the return equation as

$$
y_t^* = h_t + \log(\varepsilon_t^2),
$$

and approximate the joint distribution of $\zeta_t = \log(\varepsilon_t^2)$ and $\eta_{t,2}$, the standardized shock to the
fast volatility factor, by a mixture of 10 normals:

$$
p(\zeta_t, \eta_{t,2} \mid d_t, \rho) = \sum_{j=1}^{10} p_j\, \mathcal{N}(\zeta_t \mid m_j, v_j^2)\, \mathcal{N}\left(\eta_{t,2} \mid d_t \rho\,(a_j^* + b_j^* \zeta_t),\, 1 - \rho^2\right),
$$

where $(p_j, m_j, v_j, a_j^*, b_j^*)$, $j = 1, \ldots, 10$ are constants given in Omori et al. (2007). We then
introduce a set of mixture indicator variables $\omega_t \in \{1, \ldots, 10\}$ for $t = 1, \ldots, T$. Conditional
on the indicators, the model has a linear Gaussian state-space form, and the FFBS algorithm
is used to generate the volatility states and parameters.
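The transformation of the returns can be sketched as below. The exact placement of the offset constant inside the logarithm is an assumption made for this illustration, and the function name is hypothetical.

```python
import numpy as np

def transform_returns(y, mu, lam, J, Zy, const=1e-4):
    """Return (y_star, d) for the auxiliary mixture sampler.

    y_star is the log of the squared, scale-adjusted residual plus a small
    offset (offset placement assumed); d is the sign of the residual.
    """
    resid = np.asarray(y, dtype=float) - mu - np.asarray(J) * np.asarray(Zy)
    y_star = np.log(resid ** 2 / np.asarray(lam, dtype=float) + const)
    d = np.sign(resid)
    return y_star, d
```

The offset keeps `y_star` finite when a 5-minute return is exactly zero after removing the mean and jump component, which is common in high-frequency data.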



Appendix C: Cubic Smoothing Splines 

To estimate the seasonal and announcement effects, we use the state-space framework for
smoothing splines of Kohn and Ansley (1987). Let $g = (g_1, \ldots, g_K)$ denote the unknown
coefficients, which have a (modified) cubic smoothing spline prior of the form $\nabla^2 g_k \sim \mathcal{N}(0, c_k^2 \tau^2)$,
where $c_k$ are known constants and $\tau^2$ is an unknown smoothing parameter. We observe data
$y_k \sim \mathcal{N}(g_k, v_k^2)$ for $k = 1, 2, \ldots, K$, where $v_k^2$ are known. Defining the state vector as
$x_k = (g_k, \dot{g}_k)'$, the model can be written in state-space form as

$$
\begin{aligned}
y_k &= h' x_k + e_k, & e_k &\sim \mathcal{N}(0, v_k^2) \\
x_{k+1} &= F x_k + u_k, & u_k &\sim \mathcal{N}(0, \tau^2 c_k^2 U),
\end{aligned}
$$

where $h = (1, 0)'$,

$$
F = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1/3 & 1/2 \\ 1/2 & 1 \end{pmatrix},
$$

$e_k$ and $u_k$ are serially and mutually uncorrelated errors, and $x_1 \sim \mathcal{N}(0, c_1 I)$ with $c_1$ large.
Defining $x = (x_1, \ldots, x_K)$ and $y = (y_1, \ldots, y_K)$, and assuming the prior $\tau^2 \sim p(\tau^2)$, the
posterior distribution of interest is

$$
p(x, \tau^2 \mid y) \propto p(\tau^2) \prod_{k=1}^{K} p(y_k \mid x_k)\, p(x_{k+1} \mid x_k, \tau^2).
$$

We use a Metropolis step to generate $x$ and $\tau^2$ jointly from this distribution. Conditional
on the current value, $\tau^{2(i)}$, draw $\tau^{2(*)} \sim \mathcal{N}(\tau^{2(i)}, w)$, and accept with probability

$$
\min\left\{ 1, \frac{p(y \mid \tau^{2(*)})\, p(\tau^{2(*)})}{p(y \mid \tau^{2(i)})\, p(\tau^{2(i)})} \right\}.
$$

Here $p(y \mid \tau^2)$ is computed using the Kalman filter. If the draw is accepted, set $\tau^{2(i+1)} = \tau^{2(*)}$
and generate $x^{(i+1)} \sim p(x \mid \tau^{2(i+1)}, y)$ using the FFBS algorithm. Otherwise leave $x$ unchanged.
Since $x_k = (g_k, \dot{g}_k)'$, draws of the function $g = (g_1, \ldots, g_K)'$ are obtained directly from $x$.

The degrees of freedom for the fit is obtained by noting that the posterior mean of the 
function, conditional on r 2 , has the form E(g\y,r 2 ) = Ay, where A is the so-called 'hat- 
matrix.' The degrees of freedom is defined as d = tr(A). Following Ansley and Kohn (1987), 
this value is computed efficiently using a modified Kalman filter algorithm. 
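The Kalman-filter likelihood and the Metropolis update of the smoothing parameter can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the diffuse initial condition is approximated by a large but finite prior variance `c_init`, the FFBS draw of the states after an accepted move is omitted, and all function and argument names are hypothetical.

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])       # state transition
U = np.array([[1 / 3, 1 / 2], [1 / 2, 1.0]])  # innovation covariance shape
h = np.array([1.0, 0.0])                      # observation vector

def spline_loglik(y, v2, tau2, c2=None, c_init=1e6):
    """Kalman-filter evaluation of log p(y | tau2) for the spline model."""
    K = len(y)
    c2 = np.ones(K) if c2 is None else np.asarray(c2, dtype=float)
    m = np.zeros(2)
    P = c_init * np.eye(2)                    # near-diffuse prior on x_1
    ll = 0.0
    for k in range(K):
        e = y[k] - h @ m                      # one-step prediction error
        S = h @ P @ h + v2[k]                 # prediction error variance
        ll += -0.5 * (np.log(2 * np.pi * S) + e * e / S)
        gain = P @ h / S                      # measurement update
        m = m + gain * e
        P = P - np.outer(gain, h @ P)
        m = F @ m                             # time update
        P = F @ P @ F.T + tau2 * c2[k] * U
    return float(ll)

def metropolis_tau2(y, v2, tau2, log_prior, w, rng):
    """One random-walk Metropolis step for tau2 with proposal variance w."""
    prop = rng.normal(tau2, np.sqrt(w))
    if prop <= 0.0:
        return tau2, False                    # outside the support: reject
    log_ratio = (spline_loglik(y, v2, prop) + log_prior(prop)
                 - spline_loglik(y, v2, tau2) - log_prior(tau2))
    accept = rng.random() < np.exp(min(0.0, log_ratio))
    return (prop, True) if accept else (tau2, False)
```

In a full sampler, an accepted proposal would be followed by a forward-filtering backward-sampling draw of the states given the new value of the smoothing parameter.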
Appendix D: MCMC Algorithm

The joint posterior distribution for the model in Appendix A is

$$
p(x, \lambda, J, Z, \beta, \alpha, \theta^* \mid y) \propto p(y \mid x, \lambda, J, Z, \beta, \alpha, \theta^*)\, p(x, \lambda, J, Z \mid \theta^*)\, p(\beta \mid \theta^*)\, p(\alpha \mid \theta^*)\, p(\theta^*)
$$

where $x_t = (x_{t,1}, x_{t,2})$, $Z_t = (Z_t^y, Z_t^v)$, $y = (y_1, \ldots, y_T)$, $x = (x_1, \ldots, x_T)$, $\lambda = (\lambda_1, \ldots, \lambda_T)$,
$J = (J_1, \ldots, J_T)$, $Z = (Z_1, \ldots, Z_T)$, and $\theta^* = (\mu, \mu_h, \phi_1, \phi_2, \sigma_1, \sigma_2, \rho, \nu, \kappa, \mu_y, \sigma_y, \mu_v, \sigma_v, \tau_s, \tau_a)$.

The models were estimated using the Markov chain Monte Carlo algorithm described below. 
We ran the MCMC for 12500 iterations and discarded the first 2500 as burn-in, leaving 
10000 samples for posterior inference. The starting values were set to the prior mean (or 
mode), although we found that the results were robust to this choice. The MCMC algorithm 
consists of the following steps: 

1. Draw $p(\omega \mid y^*, x, J, Z^y, Z^v, \lambda, \theta^*)$

2. Draw $p(x, \mu_h, \phi_1, \phi_2, \sigma_1, \sigma_2, \rho \mid y^*, \omega, \beta, \alpha)$

3. Draw $p(\beta, \tau_s \mid y^*, \omega, x, \alpha, \theta^*)$

4. Draw $p(\alpha, \tau_a \mid y^*, \omega, x, \beta, \theta^*)$

5. Draw $p(\lambda, \nu \mid y, x, J, Z^y, \beta, \alpha, \mu)$

6. Draw $p(J, Z^y, Z^v \mid y, x, \lambda, \kappa, \mu_y, \sigma_y, \mu_v, \sigma_v, \mu)$

7. Draw $p(\kappa, \mu_y, \sigma_y, \mu_v, \sigma_v \mid J, Z^y, Z^v)$

8. Draw $p(\mu \mid y, x, \lambda, J, Z^y, \beta, \alpha)$

1. Sampling $\omega_t$. The indicators $\omega_t$ are independent multinomials with probabilities

$$
\Pr(\omega_t = j \mid \zeta_t, \eta_{t,2}, \rho) \propto p_j\, \phi(\zeta_t; m_j, v_j^2)\, \phi\left(\eta_{t,2}; d_t \rho\,(a_j^* + b_j^* \zeta_t), 1 - \rho^2\right), \quad j = 1, \ldots, 10.
$$

2. Sampling $(x, \mu_h, \phi_1, \phi_2, \sigma_1, \sigma_2, \rho)$. Conditional on $y_t^*$, $\omega_t$, $a_t$, $s_t$, $J_t$, $Z_t^v$, the model can
be written in state-space form:

$$
\begin{aligned}
\tilde{y}_t &= \mu_h + x_{t,1} + x_{t,2} + v_{\omega_t} u_{t,1} \\
x_{t+1,1} &= \phi_1 x_{t,1} + \sigma_1 u_{t,2} \\
x_{t+1,2} &= \phi_2 x_{t,2} + \sigma_2\left( d_t \rho\,(a^*_{\omega_t} + b^*_{\omega_t} u_{t,1}) + \sqrt{1 - \rho^2}\, u_{t,3} \right) + J_t Z_t^v
\end{aligned}
$$

where $\tilde{y}_t = y_t^* - s_t - a_t - m_{\omega_t}$, and $(u_{t,1}, u_{t,2}, u_{t,3})' \sim \mathcal{N}(0, I)$. We use the method of
Omori et al. (2007) to draw $(x, \mu_h, \phi_1, \sigma_1, \phi_2, \sigma_2, \rho)$ as a block from the full conditional.
To update the parameters $(\phi_1, \phi_2, \sigma_1, \sigma_2, \rho)$, we use a Metropolis step. The proposal
distribution is a truncated multivariate normal with a covariance matrix chosen to
achieve an acceptance probability of around 30%.

3. Sampling r s ). Conditional on the other states and parameters, we cast the model 
in state-space form by defining the state vector as /3| = ((3k,(3k)', and writing 

y k = h'/3* k + e k , e k ~N(p,vl) 
Pl+i = Ff3* k + u k , u k ~M(0,c 2 k T]U); k — 1, . . . ,288, 

where y k = v\ J2{t.F tk =i}(y*t ~ » ~ x t,i ~ x t ,2 -<h- m ut )v~ 2 , v\ = (J2{f.F tk =i} V Z^Y\ 
and 

r 100 if k = 1,25,109,187,265,271; 
1 otherwise, 



C'k 



are variance inflation factors to allow for discontinuities at selected market open- 
ing/closing times. We then use the Metropolis algorithm from Appendix C to generate 
(P,Tg) as a block. Given the simulated states, we impose the zero-sum constraint on 
the seasonal coefficients by defining f3 k = (3 k — (X^it=i AO/288. 
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4. Sampling (a, r a ). For each announcement type i — 1, . . . , n, the model is written in 
state-space form by defining the state vector as a* k = (a^k, cti,k)' and writing 

y ik = h'a* ik + e ik , e ik ~ Af(0, v 2 ik ) 
«* fc+ i = Fa* ik + u ik , u ik ~ A/"(0,t s 2 [7); fc = l,...,5, 

where y ifc = f)? fc E {i: H ilfc =i}(%*-^-^,i-^,2-St-^ t )^ 2 andu^ = (E{t:ff tifc =i } ^ 2 ) _1 - 
We then use the Metropolis algorithm from Appendix C to jointly update (a, r 2 ). 

5. Sampling $(\lambda, \nu)$. Write the joint posterior as $p(\lambda, \nu \mid \ldots) = p(\nu \mid \ldots)\, p(\lambda \mid \nu, \ldots)$. To
update the degrees of freedom, if we define $w_t = (y_t - \mu - J_t Z_t^y)/\sqrt{V_t}$, then the model is
$(w_t \mid \nu, \ldots) \sim t_\nu(0, 1)$. Under the discrete uniform prior $\nu \sim \mathcal{DU}(2, 128)$, the posterior is
a multinomial distribution $(\nu \mid w, \ldots) \sim \mathcal{M}(\pi_2, \ldots, \pi_{128})$, with probabilities

$$
\pi_\nu \propto \prod_{t=1}^{T} p_\nu(w_t), \quad \nu = 2, \ldots, 128,
$$

where $p_\nu(\cdot)$ denotes the Student-t density with $\nu$ degrees of freedom. Rather than
compute each of the multinomial probabilities, which is quite costly, we use a Metropolis
step to update $\nu$. Given the current value $\nu^{(i)}$, we draw a candidate value $\nu^{(*)} \sim
\mathcal{DU}(\nu^{(i)} - \delta, \nu^{(i)} + \delta)$, and accept with probability

$$
\min\left\{ 1, \prod_{t=1}^{T} \frac{p_{\nu^{(*)}}(w_t)}{p_{\nu^{(i)}}(w_t)} \right\}.
$$

The width $\delta$ is chosen to give an acceptance probability between 20% and 50%.

To update the scale factors, define $\varepsilon_t^* = (y_t - \mu - J_t Z_t^y)/\sqrt{V_t}$. Then $(\varepsilon_t^* \mid \lambda_t, \nu, \ldots) \sim
\mathcal{N}(0, \lambda_t)$. Combining this with the prior, $(\lambda_t \mid \nu) \sim \mathcal{IG}(\nu/2, \nu/2)$, the full conditional is

$$
(\lambda_t \mid \nu, \ldots) \sim \mathcal{IG}\left( \frac{\nu + 1}{2}, \frac{\nu + \varepsilon_t^{*2}}{2} \right).
$$

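6. Sampling $(J, Z^y, Z^v)$. Given the other states and parameters, write the model as
$(w_t \mid J_t, Z_t, \ldots) \sim \mathcal{N}(J_t Z_t, \Sigma_t)$, where

$$
w_t = \begin{pmatrix} y_t - \mu \\ x_{t+1,2} - \phi_2 x_{t,2} \end{pmatrix}, \quad
Z_t = \begin{pmatrix} Z_t^y \\ Z_t^v \end{pmatrix}, \quad
\Sigma_t = \begin{pmatrix} \lambda_t V_t & \rho \sigma_2 \sqrt{\lambda_t V_t} \\ \rho \sigma_2 \sqrt{\lambda_t V_t} & \sigma_2^2 \end{pmatrix}.
$$

Assuming conjugate priors, $J_t \sim \mathrm{Bern}(\kappa)$ and $Z_t \sim \mathcal{N}(\mu_z, \Sigma_z)$, where $\mu_z = (\mu_y, \mu_v)'$
and $\Sigma_z = \mathrm{diag}(\sigma_y^2, \sigma_v^2)$, the full conditionals for the jump times and sizes are

$$
P(J_t = 1 \mid \ldots) = \frac{\kappa\, \phi(w_t; \mu_z, \Sigma_t + \Sigma_z)}{(1 - \kappa)\, \phi(w_t; 0, \Sigma_t) + \kappa\, \phi(w_t; \mu_z, \Sigma_t + \Sigma_z)}
$$

$$
(Z_t \mid J_t = 1, \ldots) \sim \mathcal{N}\left( (\Sigma_t^{-1} + \Sigma_z^{-1})^{-1} (\Sigma_t^{-1} w_t + \Sigma_z^{-1} \mu_z),\ (\Sigma_t^{-1} + \Sigma_z^{-1})^{-1} \right).
$$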

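7. Sampling $(\kappa, \mu_y, \sigma_y, \mu_v, \sigma_v)$. Under the conjugate priors $\kappa \sim \mathcal{B}(a_\kappa, b_\kappa)$, and $(\mu_y, \sigma_y^2) \sim
\mathcal{NIG}(m_y, c_y, a_y, b_y)$, and $(\mu_v, \sigma_v^2) \sim \mathcal{NIG}(m_v, c_v, a_v, b_v)$, the full conditionals for the
jump parameters have a closed form

$$
\begin{aligned}
(\kappa \mid \ldots) &\sim \mathcal{B}(a_\kappa^*, b_\kappa^*) \\
(\mu_y, \sigma_y^2 \mid \ldots) &\sim \mathcal{NIG}(m_y^*, c_y^*, a_y^*, b_y^*) \\
(\mu_v, \sigma_v^2 \mid \ldots) &\sim \mathcal{NIG}(m_v^*, c_v^*, a_v^*, b_v^*)
\end{aligned}
$$

where for $j = y, v$,

$$
\begin{aligned}
a_\kappa^* &= a_\kappa + \sum_{t=1}^{T} J_t, & b_\kappa^* &= b_\kappa + T - \sum_{t=1}^{T} J_t, \\
c_j^* &= c_j + \sum_{t=1}^{T} J_t, & c_j^* m_j^* &= c_j m_j + \sum_{t=1}^{T} J_t Z_t^j, \\
a_j^* &= a_j + \sum_{t=1}^{T} J_t, & b_j^* &= b_j + c_j m_j^2 + \sum_{t=1}^{T} J_t (Z_t^j)^2 - c_j^* m_j^{*2}.
\end{aligned}
$$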

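8. Sampling $\mu$. Under the prior distribution, $\mu \sim \mathcal{N}(m_\mu, v_\mu^2)$, the full conditional for the
mean return is

$$
(\mu \mid \ldots) \sim \mathcal{N}\left( \left( \frac{1}{v_\mu^2} + \sum_{t=1}^{T} \frac{1}{V_t^*} \right)^{-1} \left( \frac{m_\mu}{v_\mu^2} + \sum_{t=1}^{T} \frac{\tilde{y}_t}{V_t^*} \right),\ \left( \frac{1}{v_\mu^2} + \sum_{t=1}^{T} \frac{1}{V_t^*} \right)^{-1} \right),
$$

where $V_t^* = (1 - \rho^2) \lambda_t V_t$ and

$$
\tilde{y}_t = y_t - J_t Z_t^y - \rho \sqrt{\lambda_t V_t} \left( \frac{x_{t+1,2} - \phi_2 x_{t,2} - J_t Z_t^v}{\sigma_2} \right).
$$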
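As an illustration of step 5 above, the following sketch performs one Metropolis update of the degrees of freedom on the integer grid and then draws the scale factors from their inverse-gamma full conditional. It is a sketch under assumptions, not the authors' code: the function names are hypothetical, proposals outside the prior support are simply rejected, and the inverse-gamma draw uses the standard conjugate form with shape $(\nu + 1)/2$ and rate $(\nu + \varepsilon_t^{*2})/2$.

```python
import math
import numpy as np

def student_t_loglik(w, nu):
    """Sum of log Student-t(0, 1) densities with nu degrees of freedom."""
    w = np.asarray(w, dtype=float)
    const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return len(w) * const - (nu + 1) / 2 * float(np.sum(np.log1p(w ** 2 / nu)))

def update_nu(w, nu, delta, rng, lo=2, hi=128):
    """One Metropolis step for nu with a discrete uniform proposal of half-width delta."""
    prop = int(rng.integers(nu - delta, nu + delta + 1))  # upper bound is exclusive
    if prop < lo or prop > hi:
        return nu                                         # zero prior mass: reject
    log_ratio = student_t_loglik(w, prop) - student_t_loglik(w, nu)
    return prop if rng.random() < math.exp(min(0.0, log_ratio)) else nu

def draw_lambda(eps_star, nu, rng):
    """Draw lambda_t ~ IG((nu + 1)/2, (nu + eps_star_t^2)/2) as 1 / Gamma."""
    eps_star = np.asarray(eps_star, dtype=float)
    shape = (nu + 1) / 2
    rate = (nu + eps_star ** 2) / 2
    return 1.0 / rng.gamma(shape, 1.0 / rate)
```

Because the proposal is symmetric and the prior is flat on its support, the acceptance ratio reduces to the likelihood ratio, so only two likelihood evaluations are needed per iteration instead of 127.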



Appendix E: Auxiliary Particle Filter 

We describe a general auxiliary particle filter used for the models in the paper. Assume
the parameters $\theta = (\beta, \alpha, \theta^*)$ are fixed at their posterior means. Write the state vector as
$z_t = (x_t, z_t^*)$, where $x_t$ are the stochastic volatility states and $z_t^*$ are the other state variables in
the model. The goal is to sample from the filtering distribution, $p(z_t \mid y^t) = p(x_t \mid y^t)\, p(z_t^* \mid x_t, y^t)$,
for $t = 1, \ldots, T$, where the first distribution on the right hand side is unavailable analytically,
and the second is available in closed form.

Assume we have an equally-weighted sample available, $z_{t-1}^{(i)} \sim p(z_{t-1} \mid y^{t-1})$, $i = 1, \ldots, N$,
at time $t - 1$. The goal is to sample from $p(k, x_t \mid y^t) \propto p(k \mid y^{t-1})\, p(x_t \mid z_{t-1}^{(k)}, y^{t-1})\, p(y_t \mid x_t)$, where
$k$ is the auxiliary mixture index. To do this, we first sample the index $k \sim q(k \mid y^t)$ and then
the state, $x_t \sim q(x_t \mid k, y^t)$, where

$$
\begin{aligned}
q(k \mid y^t) &\propto p\left(y_t \mid \hat{x}_t^{(k)}\right) \\
q(x_t \mid k, y^t) &= p\left(x_t \mid z_{t-1}^{(k)}, y^{t-1}\right),
\end{aligned}
$$

and $\hat{x}_t^{(k)} = E\left(x_t \mid z_{t-1}^{(k)}, y^{t-1}\right)$. We then resample the states with weights

$$
w_t^{(i)} \propto \frac{p\left(y_t \mid x_t^{(i)}\right)}{p\left(y_t \mid \hat{x}_t^{(k^i)}\right)}
$$

to obtain samples from the posterior distribution, $z_t^{(i)} \sim p(z_t \mid y^t)$.

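This leads to the following APF algorithm for $t = 1, 2, \ldots, T$ and $i = 1, \ldots, N$:

1. Start with a sample $z_{t-1}^{(i)} = \left(x_{t-1}^{(i)}, z_{t-1}^{*(i)}\right) \sim p(z_{t-1} \mid y^{t-1})$.

2. Compute $\pi_t^{(i)} \propto p\left(y_t \mid \hat{x}_t^{(i)}\right)$, where $\hat{x}_t^{(i)} = E\left(x_t \mid z_{t-1}^{(i)}, y^{t-1}\right)$.

3. Generate $k^i \sim \mathcal{M}\left(\pi_t^{(1)}, \ldots, \pi_t^{(N)}\right)$.

4. Generate $x_t^{(i)} \sim p\left(x_t \mid z_{t-1}^{(k^i)}, y^{t-1}\right)$.

5. Compute $w_t^{(i)} \propto p\left(y_t \mid x_t^{(i)}\right) / \pi_t^{(k^i)}$.

6. Generate $j^i \sim \mathcal{M}\left(w_t^{(1)}, \ldots, w_t^{(N)}\right)$ and set $x_t^{(i)} = x_t^{(j^i)}$.

7. Generate $z_t^{*(i)} \sim p\left(z_t^* \mid x_t^{(i)}, y^t\right)$.

Following Malik and Pitt (2012), the likelihood function for a fixed parameter value $\theta$ can
be approximated using the output from the auxiliary particle filter as

$$
\hat{p}(y \mid \theta) = \prod_{t=1}^{T} \left( \frac{1}{N} \sum_{i=1}^{N} w_t^{(i)} \right) \left( \frac{1}{N} \sum_{i=1}^{N} \pi_t^{(i)} \right),
$$

where $\pi_t^{(i)}$ and $w_t^{(i)}$ are the unnormalized first- and second-stage weights.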
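The APF recursion and the likelihood estimate can be illustrated on a deliberately simplified model. The sketch below assumes a single AR(1) log-volatility factor with Gaussian returns and no jumps, seasonal or announcement terms, so step 7 (drawing the remaining states) drops out; the function names and the model are stand-ins, not the specification used in the paper.

```python
import numpy as np

def norm_logpdf(y, var):
    """Log density of N(0, var) evaluated at y."""
    return -0.5 * (np.log(2 * np.pi * var) + y ** 2 / var)

def apf_loglik(y, mu_h, phi, sigma, N=1000, seed=0):
    """Auxiliary particle filter log-likelihood for a one-factor SV model.

    Assumed model: x_t = phi * x_{t-1} + sigma * u_t,  y_t ~ N(0, exp(mu_h + x_t)).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma / np.sqrt(1.0 - phi ** 2), size=N)  # stationary start
    loglik = 0.0
    for yt in y:
        x_hat = phi * x                                   # E(x_t | x_{t-1})
        log_pi = norm_logpdf(yt, np.exp(mu_h + x_hat))    # first-stage weights
        pi = np.exp(log_pi - log_pi.max())
        k = rng.choice(N, size=N, p=pi / pi.sum())        # auxiliary indices
        x_new = phi * x[k] + sigma * rng.standard_normal(N)
        log_w = norm_logpdf(yt, np.exp(mu_h + x_new)) - log_pi[k]  # second stage
        w = np.exp(log_w - log_w.max())
        # increment: log(mean first-stage weight) + log(mean second-stage weight)
        loglik += (log_pi.max() + np.log(pi.mean())
                   + log_w.max() + np.log(w.mean()))
        x = x_new[rng.choice(N, size=N, p=w / w.sum())]   # resample
    return float(loglik)
```

Working with log weights and subtracting the maximum before exponentiating avoids underflow when a return is far in the tail of the predictive distribution.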
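We illustrate the APF algorithm for the SVCJ2 model. Here we define $x_t = (x_{t,1}, x_{t,2})$, $z_t^* =
(J_t, Z_t^y, Z_t^v)$, $V_t = \exp(h_t)$, and $\varepsilon_t = (y_t - \mu - J_t Z_t^y)/\sqrt{V_t}$, and the conditional distributions
of interest are

$$
\begin{aligned}
p(x_t \mid z_{t-1}, y^{t-1}) &= \mathcal{N}\left( \begin{pmatrix} \phi_1 x_{t-1,1} \\ \phi_2 x_{t-1,2} + \sigma_2 \rho\, \varepsilon_{t-1} + J_{t-1} Z_{t-1}^v \end{pmatrix}, \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 (1 - \rho^2) \end{pmatrix} \right) \\
p(y_t \mid x_t) &= (1 - \kappa)\, \phi(y_t; \mu, V_t) + \kappa\, \phi(y_t; \mu + \mu_y, V_t + \sigma_y^2) \\
p(J_t = 1 \mid x_t, y_t) &= \kappa\, \phi(y_t; \mu + \mu_y, V_t + \sigma_y^2) / p(y_t \mid x_t) \\
p(Z_t^y \mid J_t, x_t, y_t) &= \mathcal{N}\left( \left( \frac{J_t}{V_t} + \frac{1}{\sigma_y^2} \right)^{-1} \left( \frac{J_t (y_t - \mu)}{V_t} + \frac{\mu_y}{\sigma_y^2} \right),\ \left( \frac{J_t}{V_t} + \frac{1}{\sigma_y^2} \right)^{-1} \right) \\
p(Z_t^v \mid J_t, x_t, y^t) &= \mathcal{N}(\mu_v, \sigma_v^2)
\end{aligned}
$$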
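The jump-related conditionals can be evaluated directly. The sketch below computes the posterior jump probability for a single return and the mean and variance of the return jump size given that a jump occurred; the function names are hypothetical, and the numerical values in the usage note are illustrative.

```python
import numpy as np

def norm_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def jump_posterior(y, V, mu, kappa, mu_y, sigma_y):
    """Return (P(J_t = 1 | x_t, y_t), E[Z^y | J_t = 1], Var[Z^y | J_t = 1])."""
    no_jump = (1.0 - kappa) * norm_pdf(y, mu, V)
    jump = kappa * norm_pdf(y, mu + mu_y, V + sigma_y ** 2)
    prob = jump / (no_jump + jump)
    post_var = 1.0 / (1.0 / V + 1.0 / sigma_y ** 2)   # precision-weighted combination
    post_mean = post_var * ((y - mu) / V + mu_y / sigma_y ** 2)
    return float(prob), float(post_mean), float(post_var)
```

With, say, `V = 0.06 ** 2`, `kappa = 0.003` and `sigma_y = 0.331`, a return of 0.05 is attributed almost entirely to diffusive volatility, while a return of 1.0 is classified as a jump with probability close to one.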



Appendix F: Realized Volatility Forecasting 

Conditional on fixed parameter values and posterior samples of the state vector at time $s$,
$z_s^{(i)} \sim p(z_s \mid y^s)$, $i = 1, \ldots, N$, the forecast distribution of realized volatility is obtained by
forward simulation. The steps to generate RV over a $\tau$-period horizon are as follows. For each
time $t = s + 1, \ldots, s + \tau$, generate states $z_t^{(i)} \sim p\left(z_t \mid z_{t-1}^{(i)}\right)$ and future returns $y_t^{(i)} \sim p\left(y_t \mid z_t^{(i)}\right)$
for $i = 1, \ldots, N$. Samples from the forecast distribution of realized volatility are obtained as

$$
RV_{s,\tau}^{(i)} = \sqrt{ \sum_{t=s+1}^{s+\tau} \left( y_t^{(i)} \right)^2 }.
$$

The point forecast of RV is the forecast mean:

$$
\widehat{RV}_{s,\tau} = \frac{1}{N} \sum_{i=1}^{N} RV_{s,\tau}^{(i)}.
$$
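The forward simulation can be sketched with the same simplified stand-in model as a full implementation would use for the complete state vector: a single AR(1) log-volatility factor with Gaussian returns. The function name, the reduced model and the argument names are assumptions made for illustration only.

```python
import numpy as np

def forecast_rv(x_particles, tau, mu_h, phi, sigma, seed=0):
    """Forward-simulate tau periods from each particle and return RV draws.

    Assumed model: x_t = phi * x_{t-1} + sigma * u_t,  y_t ~ N(0, exp(mu_h + x_t)).
    Returns (point forecast, array of RV draws, one per particle).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x_particles, dtype=float)
    sum_sq = np.zeros_like(x)
    for _ in range(tau):
        x = phi * x + sigma * rng.standard_normal(x.shape)            # propagate states
        y = np.exp((mu_h + x) / 2.0) * rng.standard_normal(x.shape)   # simulate returns
        sum_sq += y ** 2
    rv_draws = np.sqrt(sum_sq)
    return float(rv_draws.mean()), rv_draws
```

The array of draws gives the full predictive distribution of RV, so the same output serves both the point forecast and the tail quantiles used for the coverage probabilities.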
Appendix G: List of Announcements

No.  Announcement Type      Frequency  Days     Time
1    ADP Employment         Monthly    Wed-Thu  8:15
2    Jobless Claims         Weekly     Thu      8:30
3    Consumer Price Index   Monthly    Wed-Fri  8:30
4    Durable Goods          Monthly    Wed-Fri  8:30
5    GDP Advance            Quarterly  Thu-Fri  8:30
6    Monthly Payrolls       Monthly    Fri      8:30
7    Empire State Manuf.    Monthly    Mon-Fri  8:30
8    Consumer Confidence    Monthly    Tue-Wed  10:00
9    Philadelphia Fed       Monthly    Thu      10:00
10   ISM Manufacturing      Monthly    Mon-Fri  10:00
11   ISM Services           Monthly    Tue-Fri  10:00
12   FOMC Minutes           8/year     Tue-Wed  14:00
13   FOMC                   8/year     Tue-Wed  14:15
14   Sunday Open            Weekly     Sun      18:00

Table 6: List of the major US macroeconomic announcements that we incorporate, along with
frequency, day of the week, and time of the day (ET) on which the announcements occur.